Fitting New Speakers Based on a Short Untranscribed Sample
Abstract
Learning-based text-to-speech (TTS) systems have the potential to generalize from one speaker to the next and thus to require only a relatively short sample of any new voice. However, this promise is currently largely unrealized. We present a method designed to capture a new speaker from a short untranscribed audio sample. It employs an additional network that, given an audio sample, places the speaker in the embedding space. This network is trained as part of the speech synthesis system using various consistency losses. Our results demonstrate greatly improved performance both on the dataset speakers and, more importantly, when fitting new voices, even from very short samples.
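The abstract does not specify the fitting network's architecture or the exact form of its consistency losses, so the following is only a toy sketch of the general idea: a function maps a variable-length, untranscribed clip to a fixed-size speaker embedding, and a consistency term penalizes disagreement between embeddings of two clips from the same speaker. The mean-pooling encoder, the squared-error loss, and all names and sizes (`embed_speaker`, `consistency_loss`, `N_FEATS`, `EMB_DIM`) are assumptions made for illustration, not the authors' method.

```python
import random


def embed_speaker(frames, weights):
    """Map a clip (list of per-frame feature vectors) to a fixed-size embedding.

    Assumed toy encoder: average the frames over time, then apply a linear map.
    No transcript is used, only the audio features.
    """
    n_feats = len(frames[0])
    pooled = [sum(frame[i] for frame in frames) / len(frames) for i in range(n_feats)]
    return [sum(w * x for w, x in zip(row, pooled)) for row in weights]


def consistency_loss(emb_a, emb_b):
    """Assumed consistency term: mean squared difference between two embeddings."""
    return sum((a - b) ** 2 for a, b in zip(emb_a, emb_b)) / len(emb_a)


rng = random.Random(0)
N_FEATS, EMB_DIM = 4, 3  # hypothetical feature and embedding sizes

# Hypothetical projection weights; in a real system these would be learned.
weights = [[rng.uniform(-1, 1) for _ in range(N_FEATS)] for _ in range(EMB_DIM)]

# Two hypothetical clips of the same speaker with different lengths.
clip_a = [[rng.gauss(0, 1) for _ in range(N_FEATS)] for _ in range(50)]
clip_b = [[rng.gauss(0, 1) for _ in range(N_FEATS)] for _ in range(20)]

emb_a = embed_speaker(clip_a, weights)
emb_b = embed_speaker(clip_b, weights)
loss = consistency_loss(emb_a, emb_b)
print(len(emb_a), len(emb_b), loss >= 0.0)
```

In a trained system, a term of this kind would be minimized jointly with the synthesis objective, so that clips of any length from one speaker land close together in the embedding space.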
Journal: CoRR
Volume: abs/1802.06984
Publication date: 2018